LONGER-LENGTH ACOUSTIC UNITS FOR CONTINUOUS SPEECH RECOGNITION (ThuAmPO1)

Authors

  • Annika Hämäläinen
  • Johan De Veth
  • Louis Boves
Abstract

Recent research on the TIMIT database suggests that longer-length acoustic units are better suited for modelling pronunciation variation and long-term temporal dependencies in speech than traditional phoneme-length units, yielding substantial improvements in recognition accuracy [9]. In this paper, we investigate whether similar improvements can be gained on another database, viz. excerpts from novels in a Dutch library for the blind. We use a hierarchical method that employs a mixture of word-, syllable- and phoneme-length units. Our results show that the approach does increase the word accuracy, but to a lesser extent than expected. The paper discusses possible explanations for the finding.
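
As a rough illustration of what a mixture of word-, syllable- and phoneme-length units could look like in a recognition lexicon, the sketch below backs off from a word unit to syllable units to phonemes, depending on how often each unit occurs in the training data. The function name `choose_units`, the `min_count` threshold, the toy counts and the back-off rule itself are all assumptions made for illustration; the abstract does not specify how the paper's hierarchy is constructed.

```python
from typing import Dict, List


def choose_units(
    word: str,
    syllables: List[List[str]],
    word_counts: Dict[str, int],
    syllable_counts: Dict[str, int],
    min_count: int = 100,  # hypothetical threshold, not taken from the paper
) -> List[str]:
    """Return the acoustic units used to model `word`.

    `syllables` is the pronunciation of the word as a list of syllables,
    each syllable being a list of phoneme symbols.
    """
    # A word seen often enough gets its own word-length unit.
    if word_counts.get(word, 0) >= min_count:
        return [f"W_{word}"]

    units: List[str] = []
    for phones in syllables:
        syl = "_".join(phones)
        if syllable_counts.get(syl, 0) >= min_count:
            # A frequent syllable gets a syllable-length unit.
            units.append(f"S_{syl}")
        else:
            # Otherwise fall back to phoneme-length units.
            units.extend(phones)
    return units


# Toy counts, invented for this example only.
word_counts = {"de": 5000, "boeken": 40}
syllable_counts = {"b_u": 300, "k_@_n": 20}

print(choose_units("de", [["d", "@"]], word_counts, syllable_counts))
print(choose_units("boeken", [["b", "u"], ["k", "@", "n"]],
                   word_counts, syllable_counts))
```

With these toy counts, the frequent word is modelled by a single word unit, while the rarer word is modelled by one syllable unit followed by three phoneme units.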

Similar resources

Inference of variable-length acoustic units for continuous speech recognition

In the field of speech recognition, the patterns assumed to structure the speech material (phonemes, triphones, words...) are defined a priori according to a linguistic criterion, whereas the recognition criterion is based on an acoustic similarity measure. From this may result a lack of consistency for the recognition units. In this paper, we explore the possibility of a more data-driven approach...

Syllable-Length Acoustic Units in Large-Vocabulary Continuous Speech Recognition

Recent research on the TIMIT corpus suggests that longer-length acoustic units are better suited for modelling coarticulation and long-term temporal dependencies in speech than conventional context-dependent phone models. However, the impressive results achieved on TIMIT [1] are yet to be reproduced on other corpora, such as read speech from the Spoken Dutch Corpus. Differences between TIMIT and...

Split-lexicon based hierarchical recognition of speech using syllable and word level acoustic units

Most speech recognition systems, especially LVCSR, use context-dependent phones as the basic acoustic unit for recognition. The primary motive for this is the relative ease with which phone-based systems can be trained robustly with small amounts of data. However, as recent research indicates, significant improvements in recognition accuracy can be gained by using acoustic units of longer durati...

Inference of variable-length linguistic and acoustic units by multigrams

The efficiency of pattern recognition algorithms is highly conditioned to a proper definition of the patterns assumed to structure the data. The multigram model provides a statistical tool to retrieve sequential variable-length regularities within streams of data. In this paper, we present a general formulation of the model, applicable to single or multiple parallel strings of data having eithe...

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...

Publication date: 2005